Sparse Learning Based Linear Coherent Bi-clustering

Authors

Yi Shi

Xiaoping Liao

Xinhua Zhang

Guohui Lin

Dale Schuurmans

Abstract

Clustering algorithms are often limited by the assumption that each data point belongs to a single class and, furthermore, that all features of a data point are relevant to class determination. Such assumptions are inappropriate in applications such as gene clustering, where, given expression profile data, genes may exhibit similar behaviors under only some of the conditions, and genes may participate in more than one functional process and hence belong to multiple groups. Identifying genes that have similar expression patterns in a common subset of conditions is a central problem in gene expression microarray analysis. To overcome the limitations of standard clustering methods for this purpose, bi-clustering has often been proposed as an alternative approach, in which one seeks groups of observations that exhibit similar patterns over a subset of the features. In this paper, we propose a new bi-clustering algorithm for identifying linear-coherent bi-clusters in gene expression data, strictly generalizing the type of bi-cluster structure considered by other methods. The algorithm is a novel model, SLLB, for solving the linear coherent bi-clustering problem; it is based on recent sparse learning techniques that have gained significant attention in the machine learning research community. Experiments on both synthetic data and real gene expression data demonstrate that the model is significantly more effective than current bi-clustering algorithms for these problems. The parameter selection problem and the model's usefulness in other machine learning clustering applications are also discussed. The online appendix for this paper can be found at http://www.cs.ualberta.ca/~ys3/SLLB.
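To make the notion of a linear-coherent bi-cluster concrete, the following is a minimal illustrative sketch, not the SLLB algorithm. It assumes that "linear coherent" means each gene in the bi-cluster is an affine function (its own slope and offset) of a shared profile over the selected conditions; the function names `plant_linear_bicluster` and `min_abs_corr` are hypothetical and do not come from the paper.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def plant_linear_bicluster(n_genes, n_conds, genes, conds, seed=0):
    """Return a noise matrix with one planted linear-coherent bi-cluster.

    Assumption: inside the bi-cluster, gene g takes the value
    slope_g * base[c] + offset_g for every selected condition c.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_conds)] for _ in range(n_genes)]
    base = {c: rng.gauss(0, 1) for c in conds}  # shared profile
    for g in genes:
        slope = rng.choice([-1, 1]) * rng.uniform(0.5, 2.0)
        offset = rng.uniform(-1, 1)
        for c in conds:
            data[g][c] = slope * base[c] + offset
    return data


def min_abs_corr(data, genes, conds):
    """Smallest absolute pairwise correlation among genes over conds."""
    rows = [[data[g][c] for c in conds] for g in genes]
    return min(
        abs(pearson(rows[i], rows[j]))
        for i in range(len(rows))
        for j in range(i + 1, len(rows))
    )


genes, conds = [1, 4, 7], list(range(5, 15))
data = plant_linear_bicluster(20, 30, genes, conds)
inside = min_abs_corr(data, genes, conds)             # selected conditions only
overall = min_abs_corr(data, genes, list(range(30)))  # all conditions
print(inside, overall)
```

Restricted to the selected conditions, the planted genes are perfectly correlated up to sign, while over all conditions the relationship is diluted by the unrelated entries. This is the reason a method has to search over subsets of genes and conditions jointly. Setting every slope to 1 or every offset to 0 in this sketch gives purely additive or purely multiplicative patterns.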

Similar resources

Linear Coherent Bi-cluster Discovery via Beam Detection and Sample Set Clustering

We propose a new bi-clustering algorithm, LinCoh, for finding linear coherent bi-clusters in gene expression microarray data. Our method exploits a robust technique for identifying conditionally correlated genes, combined with an efficient density based search for clustering sample sets. Experimental results on both synthetic and real datasets demonstrated that LinCoh consistently finds more ac...

Linear Coherent Bi-Clustering via Beam Searching and Sample Set Clustering

Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting

Discovering groups of genes that share common expression profiles is an important problem in DNA microarray analysis. Unfortunately, standard bi-clustering algorithms often fail to retrieve common expression groups because (1) genes only exhibit similar behaviors over a subset of conditions, and (2) genes may participate in more than one functional process and therefore belong to multiple group...

Greedy Minimization of Weakly Supermodular Set Functions

This paper defines weak-α-supermodularity for set functions. It shows that minimizing such functions under cardinality constraints is a common task in machine learning and data mining. Moreover, any problem whose objective function exhibits this property benefits from a greedy extension phase. Explicitly, let S∗ be the optimal set of cardinality k that minimizes f and let S0 be an initial soluti...

Sparse and Low-rank Modeling on High Dimensional Data: a Geometric Perspective

BIAN, XIAO. Sparse and Low-Rank Modeling on High Dimensional Data: A Geometric Perspective. (Under the direction of Dr. Hamid Krim.) High dimensional data exhibits distinct properties compared to its low dimensional counterpart, which causes a common performance decrease and a formidable computational cost increase of traditional approaches. Novel methodologies are therefore needed to character...

Publication date: 2012
